Code of the statistical analysis for the article ‘Life Beyond A Jar: Effects of Tank Size and Enrichment on the Behaviour and Welfare of Siamese Fighting Fishes’

Author

Juliette Tariel-Adam

Published

June 5, 2024

The script contains the code of data transformation, analysis and graphs for the article “Life Beyond A Jar: Effects of Tank Size and Enrichment on the Behaviour and Welfare of Siamese Fighting Fishes” by XXauthorsXXX. It is in two versions: the raw R script in qmd format and the code output in html format.

The goal was to test the effect of tank enrichment and size on Siamese fish behaviour. The tanks Jar, Small, Medium and Large are compared among each other, and the tank Large and Barren between each other. Barren is not compared to Jar, Small or Medium because they differ by two factors: enrichment and size.

On the html version of the code, you fill find below a first row of tabs to click on (PCA, Swimming, etc) and for most of these tabs, a second row of tabs to click on (Distribution, Plot, etc).

After staying 3-7 days in a tank, the behaviour of all fish were scored by four 10-min trials in their tank. All trials occurred on the same day at 4 different times (7 am, 10 am, 2 pm, 6 pm). During the 10 min trial = 600 sec, the fish’s behaviour was assigned to one of the 10 behaviours, meaning that the behaviours are mutually exclusive (for instance, a fish can’t be scored as nest building and interacting with the surface at the same time): Resting, Swimming, Hovering, Sinking/Floating, Stereotypic swimming, Nest building, Foraging, Interation with surface, Out of view, Unsure. Unsure is not present in the raw dataset. Out.of.view is a bit special because it is not a behaviour and was not analysed. Sinking/Floating was not analysed (see explanation in the PCA section).

The data has been inspected and corrected for data entry errors (sum of scored behaviours too low/high, column shifted, typos, inconsistent nb of observations per fish/time/tank…).

Each row represents a trial. For each trial, there is: + Fish: the fish tested + Tank: in which tank the fish was + Order: the tank order, i.e. whether it was the first, second, …, fifth tank that the fish experienced. + Time: time of the day at which the trial happened + Filter: where there a filter in the tank + Behavioural columns. The columns for the behaviours are the time spent doing the specific behaviour in seconds during the trial. Some behaviours (Forgaging, Hovering, Interaction.with.Surface, Stereotypic.swimming) have been trasnformed in binary variable (.bin). The columns .bins represents if the fish performed the specific behaviour at all during the trial or not.

summary(data)
      Fish         Tank    Order        Time          Filter       Swimming    
 Elf    : 20   Jar   :48   1:52   7:00 AM :63   No-filter:180   Min.   :  0.0  
 Fairy  : 20   Small :52   2:52   10:00 AM:63   Filter   : 72   1st Qu.:133.9  
 Ghost  : 20   Medium:52   3:48   2:00 PM :63                   Median :224.8  
 Goblin : 20   Large :52   4:52   6:00 PM :63                   Mean   :224.6  
 Kinara : 20   Barren:48   5:48                                 3rd Qu.:309.2  
 Kraken : 20                                                    Max.   :486.8  
 (Other):132                                                                   
    Resting          Hovering      Stereotypic.swimming Nest.building   
 Min.   :  0.00   Min.   :  0.00   Min.   :  0.00       Min.   :  0.00  
 1st Qu.: 92.33   1st Qu.:  6.01   1st Qu.:  0.00       1st Qu.:  0.00  
 Median :190.99   Median : 23.71   Median :  0.00       Median :  0.00  
 Mean   :226.04   Mean   : 42.93   Mean   : 42.47       Mean   : 33.72  
 3rd Qu.:354.97   3rd Qu.: 56.21   3rd Qu.: 20.58       3rd Qu.:  0.00  
 Max.   :600.00   Max.   :344.27   Max.   :549.00       Max.   :512.99  
                                                                        
    Foraging      Interation.with.surface Foraging.bin SS.bin    Hovering.bin
 Min.   :  0.00   Min.   :  0.000         No :168      No :172   No : 48     
 1st Qu.:  0.00   1st Qu.:  0.000         Yes: 84      Yes: 80   Yes:204     
 Median :  0.00   Median :  1.545                                            
 Mean   :  7.83   Mean   : 15.331                                            
 3rd Qu.:  7.80   3rd Qu.: 14.117                                            
 Max.   :125.28   Max.   :251.000                                            
                                                                             
 Interacting.bin Nest.bin 
 No :125         No :201  
 Yes:127         Yes: 51  
                          
                          
                          
                          
                          
# Number of observations per fish per tank
data %>% group_by(Fish, Tank) %>%
  summarise(n=n(), .groups = 'drop') %>%
  spread(Tank,n) 
# A tibble: 13 × 6
   Fish            Jar Small Medium Large Barren
   <fct>         <int> <int>  <int> <int>  <int>
 1 Dragon           NA     4      4     4      4
 2 Elf               4     4      4     4      4
 3 Fairy             4     4      4     4      4
 4 Ghost             4     4      4     4      4
 5 Goblin            4     4      4     4      4
 6 Kinara            4     4      4     4      4
 7 Kraken            4     4      4     4      4
 8 Merlion           4     4      4     4     NA
 9 Naga              4     4      4     4      4
10 Orange Pendek     4     4      4     4      4
11 Pegasus           4     4      4     4      4
12 Phoenix           4     4      4     4      4
13 Wizard            4     4      4     4      4
# Number of observations per fish per time
data %>% group_by(Fish, Time) %>%
  summarise(n=n(), .groups = 'drop') %>%
  spread(Time,n) 
# A tibble: 13 × 5
   Fish          `7:00 AM` `10:00 AM` `2:00 PM` `6:00 PM`
   <fct>             <int>      <int>     <int>     <int>
 1 Dragon                4          4         4         4
 2 Elf                   5          5         5         5
 3 Fairy                 5          5         5         5
 4 Ghost                 5          5         5         5
 5 Goblin                5          5         5         5
 6 Kinara                5          5         5         5
 7 Kraken                5          5         5         5
 8 Merlion               4          4         4         4
 9 Naga                  5          5         5         5
10 Orange Pendek         5          5         5         5
11 Pegasus               5          5         5         5
12 Phoenix               5          5         5         5
13 Wizard                5          5         5         5
# Sum of all behaviours per trial
data3$total <- rowSums(dplyr::select(data3, beh.cols3)) 
sort(data3$total) 
  [1] 556.750 557.010 559.290 571.820 575.700 583.180 584.820 586.990 588.230
 [10] 588.970 592.550 592.810 592.910 593.320 594.830 595.190 595.330 595.560
 [19] 595.760 595.990 596.190 596.270 596.290 596.500 596.600 596.890 596.910
 [28] 597.288 597.370 597.490 597.610 597.700 597.770 597.830 597.890 598.070
 [37] 598.350 598.470 598.480 598.500 598.640 598.730 598.750 599.060 599.560
 [46] 599.580 599.640 599.720 599.810 600.000 600.000 600.040 600.090 600.290
 [55] 600.400 600.410 600.440 600.610 600.620 600.750 600.780 600.880 600.930
 [64] 601.030 601.050 601.070 601.110 601.300 601.360 601.380 601.470 601.550
 [73] 601.560 601.570 601.630 601.650 601.650 601.670 601.680 601.710 601.750
 [82] 601.770 601.810 601.820 601.820 601.860 601.870 601.960 601.970 602.050
 [91] 602.050 602.060 602.060 602.090 602.090 602.110 602.130 602.160 602.160
[100] 602.170 602.210 602.220 602.370 602.430 602.430 602.450 602.470 602.620
[109] 602.620 602.640 602.660 602.670 602.710 602.740 602.760 602.760 602.790
[118] 602.790 602.870 602.900 602.920 602.960 602.970 602.990 603.000 603.010
[127] 603.100 603.120 603.120 603.170 603.190 603.220 603.220 603.250 603.290
[136] 603.290 603.310 603.360 603.370 603.380 603.390 603.400 603.410 603.450
[145] 603.490 603.520 603.650 603.670 603.690 603.690 603.700 603.720 603.730
[154] 603.800 603.820 603.840 603.880 603.880 603.890 603.910 603.940 604.010
[163] 604.010 604.020 604.040 604.110 604.120 604.130 604.130 604.190 604.200
[172] 604.210 604.230 604.260 604.300 604.300 604.310 604.320 604.350 604.360
[181] 604.360 604.380 604.380 604.400 604.410 604.410 604.430 604.470 604.520
[190] 604.550 604.590 604.610 604.640 604.660 604.660 604.670 604.670 604.680
[199] 604.690 604.710 604.720 604.730 604.750 604.780 604.790 604.810 604.870
[208] 604.890 604.900 604.930 604.930 604.950 604.960 605.000 605.000 605.010
[217] 605.030 605.060 605.280 605.290 605.310 605.330 605.470 605.470 605.570
[226] 605.650 605.780 605.830 605.910 605.980 606.000 606.150 606.360 607.020
[235] 607.170 607.680 607.700 608.260 608.540 608.810 609.200 609.340 609.450
[244] 610.030 610.130 612.490 612.530 614.340 614.850 616.970 622.480 624.680

The sum of all behaviours is not exactly 600 sec (as Unsure is not in the dataset).

The mean value over multiple trials is plotted to only have one value per tank and per behaviour.

# nb of fish had a filter in their tank
table(data$Tank, data$Filter)/4 
        
         No-filter Filter
  Jar           12      0
  Small         13      0
  Medium         7      6
  Large          7      6
  Barren         6      6
# nb of times each fish had a filter in their tank
table(data$Fish, data$Filter)/4 
               
                No-filter Filter
  Dragon                1      3
  Elf                   5      0
  Fairy                 2      3
  Ghost                 2      3
  Goblin                5      0
  Kinara                5      0
  Kraken                2      3
  Merlion               4      0
  Naga                  5      0
  Orange Pendek         5      0
  Pegasus               5      0
  Phoenix               2      3
  Wizard                2      3
plot1("Filter", main = data3, beh_cols = beh.cols3, palette = palette.beh3)

There have never been a filter in Jar or Small. Some fish never had a filter in their tank. Filter needs to be included in all models as a control variable.

We run a PCA in order to:

  • Try to reduce the number of variables to analyse
  • Determine which behaviours were important to explain the variability between trials
  • See correlations between behaviours

The PCA did not help reduce the number of variables to analyse (see Analysis_PCA.html for more explanation) so the behaviours were analysed separately with linear models. The advantage of separate linear models compared to the PCA is the possibility to quantitatively interpret the differences between tanks.

Even if we did not use PCA for further analysis, the first 4 components are plotted below, which explained 71% of variance in the data, to show the important variables and the correlations between variables.

plot(pca, choix = "var") + labs(title = NULL)

plot(pca, choix = "var", axes = c(3,4)) + labs(title = NULL)

# Contribution to each principal component
round(pca$var$cos2,2)
                        Dim.1 Dim.2 Dim.3 Dim.4
Swimming                 0.68  0.01  0.10  0.00
Resting                  0.80  0.00  0.14  0.04
Hovering                 0.00  0.51  0.04  0.11
Stereotypic.swimming     0.06  0.52  0.13  0.23
Nest.building            0.01  0.06  0.57  0.19
Foraging                 0.32  0.01  0.16  0.18
Interation.with.surface  0.09  0.16  0.05  0.21
Sinking.Floating         0.06  0.00  0.03  0.18
# Extent to which each behavioural type contributes to the total variance explained by the first 3 components
rowSums(pca$var$contrib[,1:3])/3
               Swimming                 Resting                Hovering 
              13.946992               17.017016               14.360858 
   Stereotypic.swimming           Nest.building                Foraging 
              18.372673               17.241361               10.134302 
Interation.with.surface        Sinking.Floating 
               7.115428                1.811372 
  • The 1st PCA component was driven by Resting (cos2 = 0.8), Swimming (0.7) and Foraging (0.3). Swimming and Foraging opposite to Resting. This first component could be interpret as “activity”. If a fish spent a lot of time swimming and foraging during a trial, it was very little resting. This first component explained 25% of the variance in the data (as a reminder, the data is the time spent performing the different types of behaviour). So 1 quarter of the variance is a matter of activity. Keep in mind for the rest of the analyses that these behavioural types are correlated.

  • The 2nd PCA component was driven by Hovering (cos2 = 0.5) and Stereotypic.swimming (0.5), and a bit by Interaction with surface (0.16). Hovering and Interacting.with.Surface opposite to Stereotypic.swimming. If the fish spent a lot of time hovering (and interacting with the surface), it was very little stereotypic swimming.

  • The 3rd PCA component was driven mainly by Nest building (cos2 = 0.6), and a bit by Foraging (0.16), Resting (0.14) and Stereotypic Swimming (0.13). Foraging and Resting opposite to Nest building and Stereotypic Swimming. If a fish spent time foraging and resting in a trial, it was little nest building and stereotypic swimming. This makes sense, it a fish is building a nest,it is an important task and needs to be finished, so it is not scattering its time with other activities. The apparent correlation between nest building and stereotypic swimming could be due to Jar where fish in this tank rarely engaged in nest building or stereotypic swimming.

  • The 4th PCA component was not easily interpretable = not easily linked to the initial behavioural variables.

Sinking/Floating contributed almost nothing to the PCA (1.8%) + did not correlate strongly with any of the component. Sinking/Floating was not important to explain the variability between trials. We did not further analyse this behaviour.

hist1(data$Swimming)

The first plot is to see the trend by tank. Big red dots is for the overall mean by tank. Small black points are raw data = swimming time of each trial.

The second plot is to see the trend by fish and by tank. The dashed black line is means by tank. The other lines are the means for each fish by tank.

plot_var(data, "Swimming")

plot_var_fish(data, "Swimming")

mSwimming <- lmer(Swimming ~ Tank + Time + Filter + Order + (1|Fish), data = data)
fixed-effect model matrix is rank deficient so dropping 1 column / coefficient
# Error message due to the fact that all trials with Tank = Barren are also Order = "5"
anova(mSwimming, ddf = "Kenward-Roger", type = 2)
Type II Analysis of Variance Table with Kenward-Roger's method
       Sum Sq Mean Sq NumDF  DenDF F value    Pr(>F)    
Tank   219905   54976     4 233.88  9.5912 3.376e-07 ***
Time   337219  112406     3 228.04 19.6133 2.393e-11 ***
Filter  71734   71734     1 152.69 12.5166 0.0005346 ***
Order   68083   22694     3 230.47  3.9598 0.0088524 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
contr_tank.size(mSwimming)
         contrast estimate   SE  df t.ratio p.value lower.CL upper.CL
1    jar_vs_small      -16 15.3 229   -1.07   0.852      -57       24
2   jar_vs_medium      -17 17.7 240   -0.97   0.852      -64       30
3    jar_vs_large      -92 17.4 239   -5.29  <0.001     -138      -46
4 small_vs_medium       -1 17.3 240   -0.04   0.966      -47       45
5  small_vs_large      -76 17.1 240   -4.41  <0.001     -121      -30
6 medium_vs_large      -75 15.0 228   -5.00  <0.001     -115      -35
contr_tank.enrich(mSwimming)
         contrast estimate   SE  df t.ratio p.value lower.CL upper.CL
1 large_vs_barren       53 15.3 229    3.49  <0.001       23       84
# estimated means of time spent swimming depending on:
emmeans(mSwimming, ~ Time) # time of day
 Time     emmean   SE   df lower.CL upper.CL
 7:00 AM     298 15.1 27.2      267      329
 10:00 AM    240 15.1 27.2      209      271
 2:00 PM     204 15.1 27.2      173      235
 6:00 PM     214 15.1 27.2      183      245

Results are averaged over the levels of: Tank, Filter, Order 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
emmeans(mSwimming, ~ Filter) # filter
 Filter    emmean   SE   df lower.CL upper.CL
 No-filter    206 13.2 15.1      178      234
 Filter       272 18.0 34.1      236      309

Results are averaged over the levels of: Tank, Time, Order 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
rbind(emmeans(mSwimming, ~ Order, at = list(Tank = c("Jar", "Small", "Medium", "Large"),Order = c("1", "2", "3", "4"))),
      emmeans(mSwimming, ~ Order, at = list(Tank = c("Barren"),Order = "5"))) # tank order
 Order emmean   SE   df lower.CL upper.CL
 1        225 16.8 36.6      180      271
 2        256 15.7 31.3      213      299
 3        256 16.2 35.5      212      301
 4        213 16.3 34.2      169      258
 5        245 15.6 31.8      202      288

Results are averaged over some or all of the levels of: Tank, Time, Filter 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Conf-level adjustment: bonferroni method for 5 estimates 

Effect of Tank, Time, Filter and Order!

  • Tank. Large swam significantly more time than Jar, Small and Medium. Fish in Large swan 92 sec [CI 46, 138] more than Jar (over a 600 sec trial), xx sec [CI xx, xx] more than Small, and xx sec [CI xx, xx] more than Medium. In addition, fish in Large swam significantly more time (xx sec [CI xx, xx]) than fish in Barren.

  • Time. Fish swim more in the morning than in the afternoon, especially at 7 am.

  • Filter. Fish swim more with a filter in their tank.

  • Order. All confidence intervals are overlapping so probably no significant difference between orders. It seems that fish swim more in their 2nd and 3rd tank.

  • Effect of Barren or tank order 5? The effect size of Tank (shown by the Sum Sq) is three times more than the effect size of Order. Max difference between orders 1-4 is 43 sec. Difference between Large and Barren/5 is 53. Effect of tank is stronger on time spent swimming, so it is likely the diff between Large and Barren/5 is due (at least party) to Tank.

# Uncomment the 2 lines below if running the script for the first time
## rpt.swimming <- rpt(Swimming ~ Tank + Time + Filter + Order + (1|Fish), grname = "Fish", data = data, datatype = "Gaussian", nboot = 10000, npermut = 0)
## save(rpt.swimming, file = "Script/Repeatability rptR output/rpt.swimming")
base::load("Script/Repeatability rptR output/rpt.swimming")
summary(rpt.swimming)

Repeatability estimation using the lmm method

Call = rpt(formula = Swimming ~ Tank + Time + Filter + Order + (1 | Fish), grname = "Fish", data = data, datatype = "Gaussian", nboot = 10000, npermut = 0)

Data: 252 observations
----------------------------------------

Fish (13 groups)

Repeatability estimation overview: 
      R     SE   2.5%  97.5% P_permut  LRT_P
  0.217 0.0823 0.0613  0.377       NA      0

Bootstrapping and Permutation test: 
            N   Mean Median   2.5%  97.5%
boot    10000   0.21  0.206 0.0613  0.377
permut      1     NA     NA     NA     NA

Likelihood ratio test: 
logLik full model = -1453.809
logLik red. model = -1470.599
D  = 33.6, df = 1, P = 3.42e-09

----------------------------------------
print(VarCorr(mSwimming),comp=c("Variance"), digits=2)
 Groups   Name        Variance
 Fish     (Intercept) 1590    
 Residual             5731    

Likelihood ratio test indicates a statistically significant random effect. Meaning that fish behaved differently among each others in terms of swimming. But the repeatability is quite low (0.2 [95% CI 0.06, 0.37]).

lm_diagnosis(mSwimming)
[1] "check constant variance over fitted values"

[1] "Normality of Swimming"

[1] "Normality random residuals"

The model seems good.

hist1(data$Resting)

table(data$Resting==0)

FALSE  TRUE 
  241    11 

Only 11 trials where the fish did not rest at all.

plot_var(data, "Resting")

plot_var_fish(data, "Resting")

mResting <- lmer(Resting ~ Tank + Time + Filter + Order + (1|Fish), data = data)
fixed-effect model matrix is rank deficient so dropping 1 column / coefficient
anova(mResting, ddf = "Kenward-Roger", type = 2)
Type II Analysis of Variance Table with Kenward-Roger's method
       Sum Sq Mean Sq NumDF  DenDF F value    Pr(>F)    
Tank   493242  123311     4 232.15  9.9054 2.041e-07 ***
Time   795993  265331     3 228.01 21.3145 3.308e-12 ***
Filter      1       1     1 224.03  0.0001 0.9936214    
Order  218417   72806     3 229.40  5.8486 0.0007258 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
contr_tank.size(mResting)
         contrast estimate   SE  df t.ratio p.value lower.CL upper.CL
1    jar_vs_small       50 22.6 228    2.19   0.089      -11      110
2   jar_vs_medium        5 26.4 236    0.20   0.842      -65       75
3    jar_vs_large      109 25.9 236    4.21  <0.001       40      178
4 small_vs_medium      -44 25.8 237   -1.72   0.174     -113       24
5  small_vs_large       60 25.5 237    2.34   0.081       -8      128
6 medium_vs_large      104 22.1 228    4.70  <0.001       45      163
contr_tank.enrich(mResting)
         contrast estimate   SE  df t.ratio p.value lower.CL upper.CL
1 large_vs_barren        4 22.5 228    0.17   0.865      -41       48
emmeans(mResting, ~ Time)
 Time     emmean   SE   df lower.CL upper.CL
 7:00 AM     146 29.9 18.5     82.9      208
 10:00 AM    203 29.9 18.5    140.7      266
 2:00 PM     263 29.9 18.5    200.9      326
 6:00 PM     291 29.9 18.5    228.5      354

Results are averaged over the levels of: Tank, Filter, Order 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
rbind(emmeans(mResting, ~ Order, at = list(Tank = c("Jar", "Small", "Medium", "Large"),Order = c("1", "2", "3", "4"))),
      emmeans(mResting, ~ Order, at = list(Tank = c("Barren"),Order = "5")))
 Order emmean   SE   df lower.CL upper.CL
 1        251 31.8 23.1    162.2      341
 2        204 30.4 20.0    117.8      291
 3        215 31.1 21.7    127.4      303
 4        290 31.2 21.7    202.2      378
 5        168 30.3 19.9     81.9      255

Results are averaged over some or all of the levels of: Tank, Time, Filter 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Conf-level adjustment: bonferroni method for 5 estimates 

Effect of Tank, Time and Order.

  • Tank. Large rested significantly less time than Jar and Medium (almost significant for Small). Fish in Large rested 109 sec [CI 40, 178] less time than fish in Jar (over a 600 sec trial), and xx sec [CI xx, xx] more fish in Medium. This is in line with the PCA and the results of Swimming: Fish that spent time swimming spent less time resting. Fish in Large were overall more active. No significant difference between Large and Barren.
  • Time. Fish rest less time in the morning, especially at 7 am. In line with the PCA and the results of Swimming.
  • Order. All confidence intervals are overlapping so probably no significant difference between orders. It seems that fish rest more in their 1st and 4th tank.
# Uncomment the 2 lines below if running the script for the first time
## rpt.resting <- rpt(Resting ~ Tank + Time + Filter + Order + (1|Fish), grname = "Fish", data = data, datatype = "Gaussian", nboot = 10000, npermut = 0)
## save(rpt.resting, file = "Script/Repeatability rptR output/rpt.resting")
base::load("Script/Repeatability rptR output/rpt.resting")
summary(rpt.resting)

Repeatability estimation using the lmm method

Call = rpt(formula = Resting ~ Tank + Time + Filter + Order + (1 | Fish), grname = "Fish", data = data, datatype = "Gaussian", nboot = 10000, npermut = 0)

Data: 252 observations
----------------------------------------

Fish (13 groups)

Repeatability estimation overview: 
      R     SE   2.5%  97.5% P_permut  LRT_P
  0.406  0.106  0.177  0.588       NA      0

Bootstrapping and Permutation test: 
            N   Mean Median   2.5%  97.5%
boot    10000   0.39  0.392  0.177  0.588
permut      1     NA     NA     NA     NA

Likelihood ratio test: 
logLik full model = -1556.775
logLik red. model = -1595.284
D  = 77, df = 1, P = 8.48e-19

----------------------------------------
print(VarCorr(mResting),comp=c("Variance"), digits=2)
 Groups   Name        Variance
 Fish     (Intercept)  8519   
 Residual             12448   

Likelihood ratio test indicates a statistically significant random effect. Meaning that fish behaved differently among each others in terms of swimming. The repeatability is good 0.4 [95% CI 0.16, 0.58].

lm_diagnosis(mResting)
[1] "check constant variance over fitted values"

[1] "Normality of Resting"

[1] "Normality random residuals"

The model seems good.

Analysis only for Small, Medium and Large.

Plot

Amount of time resting at the different places depending on the tank. Each bar value is the mean of resting times over trials and fish.

From the graphs, we can see that fish use different places to rest, not only one type. They use all the different places available. It does not seem that there is a preferred resting place.

Interesting, fish avoided resting on filter. 48 trials with a filter in the tank (24 in Medium + 24 in Large) but only one trial were the fish rested on the filter for Medium and Large. Fish only rested on filter in Barren.

# Number of trials with a filter
table(data$Tank, data$Filter)
        
         No-filter Filter
  Jar           48      0
  Small         52      0
  Medium        28     24
  Large         28     24
  Barren        24     24
# Trials with resting on filter
filter(master[,c(1:5,7:8)], Resting.place == "Filter")  
     Fish   Tank Order     Time Filter Resting Resting.place
1  Dragon Barren     5  7:00 AM Filter   33.68        Filter
2  Dragon Barren     5  6:00 PM Filter   63.06        Filter
3  Dragon Barren     5  6:00 PM Filter    7.98        Filter
4  Dragon Barren     5  6:00 PM Filter   61.63        Filter
5  Dragon  Large     1 10:00 AM Filter    2.38        Filter
6  Dragon Barren     5 10:00 AM Filter   23.86        Filter
7  Dragon Barren     5 10:00 AM Filter    4.68        Filter
8  Dragon Barren     5 10:00 AM Filter   19.23        Filter
9   Ghost Barren     5  7:00 AM Filter   34.16        Filter
10  Ghost Barren     5 10:00 AM Filter  103.08        Filter
11  Ghost Barren     5 10:00 AM Filter   83.82        Filter
12  Ghost Barren     5  2:00 PM Filter   77.26        Filter
13  Ghost Barren     5  6:00 PM Filter   59.98        Filter
14  Ghost Barren     5  6:00 PM Filter   25.88        Filter
15  Ghost Barren     5  6:00 PM Filter   12.95        Filter
16  Ghost Barren     5  6:00 PM Filter  143.40        Filter
17 Kraken Barren     5 10:00 AM Filter   24.96        Filter

Analysis

mRP <- lmer(Resting ~ Tank + Time + Filter + Order + Resting.place + (1|Fish), data = data_RP)
anova(mRP, ddf = "Kenward-Roger", type = 2)
Type II Analysis of Variance Table with Kenward-Roger's method
              Sum Sq Mean Sq NumDF  DenDF F value    Pr(>F)    
Tank          221381  110691     2 345.38 12.3400 6.662e-06 ***
Time          204793   68264     3 363.24  7.6123 5.989e-05 ***
Filter          2844    2844     1  79.37  0.3172  0.574899    
Order         134848   44949     3 371.40  5.0123  0.002033 ** 
Resting.place 301913   75478     4 366.64  8.4167 1.664e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
emmeans(mRP, pairwise ~ Resting.place, adjust = "holm")$contrasts
 contrast         estimate   SE  df t.ratio p.value
 Floor - Surface    103.84 18.4 369   5.638  <.0001
 Floor - Plant       26.13 11.0 364   2.381  0.1244
 Floor - Barrel       3.95 18.5 366   0.214  1.0000
 Floor - Filter      81.07 97.5 365   0.832  1.0000
 Surface - Plant    -77.71 18.1 371  -4.300  0.0002
 Surface - Barrel   -99.89 23.6 370  -4.238  0.0002
 Surface - Filter   -22.77 98.7 366  -0.231  1.0000
 Plant - Barrel     -22.18 18.2 365  -1.218  1.0000
 Plant - Filter      54.94 97.4 366   0.564  1.0000
 Barrel - Filter     77.11 98.1 366   0.786  1.0000

Results are averaged over the levels of: Tank, Time, Filter, Order 
Degrees-of-freedom method: kenward-roger 
P value adjustment: holm method for 10 tests 
# With interaction
table(data_RP$Resting.place, data_RP$Tank)
         
          Small Medium Large
  Floor      44     47    44
  Surface    15      9    12
  Plant      60     61    51
  Barrel      0      0    43
  Filter      0      0     1
mRP <- lmer(Resting ~ Tank*Resting.place + Time + Filter + Order + (1|Fish), data = data_RP)
fixed-effect model matrix is rank deficient so dropping 4 columns / coefficients
anova(mRP, ddf = "Kenward-Roger", type = 2)
Type II Analysis of Variance Table with Kenward-Roger's method
                   Sum Sq Mean Sq NumDF  DenDF F value    Pr(>F)    
Tank               220921  110461     2 339.72 12.4591 6.002e-06 ***
Resting.place      300848   75212     4 362.80  8.4858 1.486e-06 ***
Time               187705   62568     3 359.36  7.0593 0.0001271 ***
Filter               4899    4899     1  76.06  0.5527 0.4594910    
Order              115693   38564     3 366.95  4.3509 0.0049877 ** 
Tank:Resting.place  78160   19540     4 361.71  2.2046 0.0680317 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
emmeans(mRP, pairwise ~ Resting.place) 
NOTE: Results may be misleading due to involvement in interactions
$emmeans
 Resting.place emmean   SE    df lower.CL upper.CL
 Floor         110.42 12.2  24.3     85.2    135.7
 Surface         4.05 19.6 114.8    -34.8     42.8
 Plant          83.98 11.8  20.8     59.5    108.5
 Barrel        nonEst   NA    NA       NA       NA
 Filter        nonEst   NA    NA       NA       NA

Results are averaged over the levels of: Tank, Time, Filter, Order 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 

$contrasts
 contrast         estimate   SE  df t.ratio p.value
 Floor - Surface     106.4 18.7 365   5.692  <.0001
 Floor - Plant        26.4 10.9 360   2.421  0.0421
 Floor - Barrel     nonEst   NA  NA      NA      NA
 Floor - Filter     nonEst   NA  NA      NA      NA
 Surface - Plant     -79.9 18.4 367  -4.355  0.0001
 Surface - Barrel   nonEst   NA  NA      NA      NA
 Surface - Filter   nonEst   NA  NA      NA      NA
 Plant - Barrel     nonEst   NA  NA      NA      NA
 Plant - Filter     nonEst   NA  NA      NA      NA
 Barrel - Filter    nonEst   NA  NA      NA      NA

Results are averaged over the levels of: Tank, Time, Filter, Order 
Degrees-of-freedom method: kenward-roger 
P value adjustment: tukey method for comparing a family of 3 estimates 
hist1(data$Foraging)

table(data$Foraging==0)

FALSE  TRUE 
   84   168 
descdist(data$Foraging, boot = 1000, print = FALSE)

The data are clearly not following a normal distribution. It seems that a beta distribution could suit but foraging is not a continuous variable ranging from 0 to 1. Foraging has thus been analysed as a binary variable: 1 if fish spent time foraging during this trial, 0 otherwise.

There were 168 trials (out of 252) during which no foraging happened.

The plots are in percentage of trials during which the fish foraged over the total number of trials.

plot_var_binary(data, "Foraging.bin")

plot_var_binary_fish(data, "Foraging.bin")

mForaging <- glmmTMB(Foraging.bin ~ Tank + Time + Filter + Order + (1|Fish), data = data, family =binomial)
dropping columns from rank-deficient conditional model: Order5
glmmTMB:::Anova.glmmTMB(mForaging)
Analysis of Deviance Table (Type II Wald chisquare tests)

Response: Foraging.bin
         Chisq Df Pr(>Chisq)    
Tank   31.4734  4  2.451e-06 ***
Time    1.8543  3     0.6032    
Filter  0.4113  1     0.5213    
Order   4.5561  3     0.2073    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
contr_tank.size_bin(mForaging)
         contrast odds.ratio   SE  df null z.ratio p.value asymp.LCL asymp.UCL
1    jar_vs_small       2.63 1.57 Inf    1    1.61   0.214     0.541    12.763
2   jar_vs_medium       0.38 0.24 Inf    1   -1.53   0.214     0.074     1.995
3    jar_vs_large       0.09 0.06 Inf    1   -3.71   0.001     0.015     0.493
4 small_vs_medium       0.15 0.10 Inf    1   -2.80   0.015     0.024     0.896
5  small_vs_large       0.03 0.02 Inf    1   -4.56  <0.001     0.005     0.237
6 medium_vs_large       0.23 0.11 Inf    1   -3.02   0.010     0.062     0.827
contr_tank.enrich_bin(mForaging)
         contrast odds.ratio   SE  df null z.ratio p.value asymp.LCL asymp.UCL
1 large_vs_barren      15.75 8.76 Inf    1    4.95  <0.001     5.289    46.879
# example of differences between Jar and Large
## Contrast estimate
1/0.09
[1] 11.11111
## Upper CI
1/0.015
[1] 66.66667
## Lower CI
1/0.493
[1] 2.028398

Effect of Tank.

As a reminder, an odds.ratio < 1 between tank X vs tank Y means the probability of foraging during a trial in tank X is smaller that the probability of foraging during a trial in tank Y. An odd ratio for instance of 0.2 means that a fish is (1 / 0.2) = five times more likely to forage during a trial in tank Y than in Tank X. If we imagine 5 trials, fish will forage on average during 1 trial out of 5 in Tank X whereas fish will forage on average during 5 trials out of 5 in Tank Y.

  • Tank. Fish were more likely to forage during a trial in Large compared to Jar, Small and Medium. Again in accordance with the PCA and the results of Swimming and Resting. Fish in Large were on average 11 times [CI 2, 67] more likely to forage during a trial than fish in Jar, xx times [CI xx, xx] than fish in Small, and xx times [CI xx, xx] than fish in Medium. In addition, fish were xx times [CI xx, xx] more likely to forage during a trial in Large than Barren. In accordance with the PCA and the results of Swimming. This time, there was also a significant difference between small and medium.
rpt.foraging <- rpt(Foraging.bin ~ Tank + Time + Filter + Order + (1|Fish), grname = "Fish", data = data, datatype = "Binary", nboot = 0, npermut = 0)
fixed-effect model matrix is rank deficient so dropping 1 column / coefficient
fixed-effect model matrix is rank deficient so dropping 1 column / coefficient
# can calculate repeatability but not the CI. I tried using bootstrap and permutation.

# Likelihood ratio test of random effect
mForaging0 <- glmmTMB(Foraging.bin ~ Tank + Time + Filter + Order, data = data, family =binomial)
dropping columns from rank-deficient conditional model: Order5
anova(mForaging0, mForaging)
Data: data
Models:
mForaging0: Foraging.bin ~ Tank + Time + Filter + Order, zi=~0, disp=~1
mForaging: Foraging.bin ~ Tank + Time + Filter + Order + (1 | Fish), zi=~0, disp=~1
           Df    AIC    BIC  logLik deviance Chisq Chi Df Pr(>Chisq)    
mForaging0 12 300.08 342.43 -138.04   276.08                            
mForaging  13 281.88 327.76 -127.94   255.88  20.2      1  6.976e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print(VarCorr(mForaging),comp=c("Variance"), digits=2)

Conditional model:
 Groups Name        Variance
 Fish   (Intercept) 1.8     

Can calculate the repeatability but not its CI. Cannot calculate repeatability for Stereotypic swimming anyway. Hard to estimate repeatability on binary data. I don’t know how to do it without the package rptR. I think we don’t have enough data power anyway to calculate repeatability with a binary variable.

Likelihood ratio test indicates a statistically significant random effect. Meaning that fish behaved differently among each others in terms of foraging.

mForaging_res <- simulateResiduals(mForaging)
plot(mForaging_res)
Warning in newton(lsp = lsp, X = G$X, y = G$y, Eb = G$Eb, UrS = G$UrS, L = G$L,
: Fitting terminated with step failure - check results carefully

plotResiduals(mForaging_res, form = data$Tank)

plotResiduals(mForaging_res, form = data$Time)

plotResiduals(mForaging_res, form = data$Filter)

The model seems good.

hist1(data$Stereotypic.swimming)

table(data$Stereotypic.swimming==0)

FALSE  TRUE 
   80   172 
descdist(data$Stereotypic.swimming, boot = 1000, print = FALSE)

Idem Foraging. Analysis as binary variable if the fish performs stereotypic swimming during a trial or not.

There were 172 trials (out of 252) during which no stereotypic swimming happened.

(plotSS1 <- plot_var_binary(data, "SS.bin"))

(plotSS2 <- plot_var_binary_fish(data, "SS.bin"))

Huge variability between fish as always. Some fish never performed stereotypic swimming while some in all tanks.

mSS <- glmmTMB(SS.bin ~ Tank + Time + Filter + Order + (1|Fish), data = data, family =binomial)
dropping columns from rank-deficient conditional model: Order5
glmmTMB:::Anova.glmmTMB(mSS)
Analysis of Deviance Table (Type II Wald chisquare tests)

Response: SS.bin
         Chisq Df Pr(>Chisq)    
Tank   22.3290  4  0.0001723 ***
Time    3.1486  3  0.3692807    
Filter  1.7487  1  0.1860451    
Order   2.7610  3  0.4299628    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
contr_tank.size_bin(mSS)
         contrast odds.ratio   SE  df null z.ratio p.value asymp.LCL asymp.UCL
1    jar_vs_small       0.03 0.02 Inf    1   -4.54  <0.001     0.003     0.220
2   jar_vs_medium       0.23 0.22 Inf    1   -1.52   0.385     0.017     2.975
3    jar_vs_large       0.23 0.23 Inf    1   -1.47   0.385     0.017     3.182
4 small_vs_medium       8.43 7.49 Inf    1    2.40   0.082     0.808    87.883
5  small_vs_large       8.63 7.93 Inf    1    2.35   0.082     0.764    97.398
6 medium_vs_large       1.02 0.63 Inf    1    0.04   0.970     0.201     5.202
contr_tank.enrich_bin(mSS)
         contrast odds.ratio   SE  df null z.ratio p.value asymp.LCL asymp.UCL
1 large_vs_barren       0.29 0.18 Inf    1   -1.96   0.051     0.081     1.003

Effect of Tank.

  • Tank. Jar were less likely to perform stereotypic swimming during a trial than Small/Medium/Large but only significant between Jar and Small. Fish in Small were xx times [CI xx, xx] more likely to perform stereotypic swimming during a trial than fish in Jar. Almost significant difference between Large and Barren: Fish in Barren were 3.4 times [CI xx, xx] more likely to perform stereotypic swimming during a trial than fish in Large.
# Attempt to calculate repeatability
rpt.SS <- rpt(SS.bin ~ Tank + Time + Filter + Order + (1|Fish), grname = "Fish", data = data, datatype = "Binary", nboot = 0, npermut = 0)
fixed-effect model matrix is rank deficient so dropping 1 column / coefficient
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.00851543 (tol = 0.002, component 1)
fixed-effect model matrix is rank deficient so dropping 1 column / coefficient
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.00851543 (tol = 0.002, component 1)
# error

# Likelihood ratio test of random effect
mSS0 <- glmmTMB(SS.bin ~ Tank + Time + Filter, data = data, family =binomial)
anova(mSS0, mSS)
Data: data
Models:
mSS0: SS.bin ~ Tank + Time + Filter, zi=~0, disp=~1
mSS: SS.bin ~ Tank + Time + Filter + Order + (1 | Fish), zi=~0, disp=~1
     Df    AIC    BIC   logLik deviance  Chisq Chi Df Pr(>Chisq)    
mSS0  9 277.42 309.19 -129.710   259.42                             
mSS  13 208.92 254.81  -91.462   182.92 76.496      4  9.614e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print(VarCorr(mSS),comp=c("Variance"), digits=2)

Conditional model:
 Groups Name        Variance
 Fish   (Intercept) 8.5     

Error when calculating the repeatability with rptR.

Likelihood ratio test indicates a statistically significant random effect. Meaning that fish behaved differently among each others in terms of stereotypic swimming.

mSS_res <- simulateResiduals(mSS)
plot(mSS_res)

plotResiduals(mSS_res, form = data$Tank)

plotResiduals(mSS_res, form = data$Time)

plotResiduals(mSS_res, form = data$Filter)

# Problem with Filter, I retried the model without Filter
mSS2 <- glmmTMB(SS.bin ~ Tank + Time + (1|Fish), data = data, family =binomial)
plot(simulateResiduals(mSS2))

It is expected to have a problem with filter as there are more than twice more trials without filter than with filter. No problem when removing Filter from the model. The model seems good.

Just a plot with the type of stereotypic swimming depending on the tank. To stick with the analysis, it is a count (count = 1 if the fish performs a certain type of stereotypic swimming during a trial).

It seems that there are not differences between tanks. Just circles seem specific to larger tanks (Medium/Large/Barren).

     Tank          Fish     Time Stereotypic.swimming.type Nb.rep
1   Small        Dragon  7:00 AM                    Pacing    328
2   Small        Dragon 10:00 AM                    Pacing     76
3   Small        Dragon  2:00 PM                    Pacing     20
4   Small        Dragon  6:00 PM                    Pacing      8
5   Small        Dragon  6:00 PM         Pacing front wall      8
6  Barren        Dragon  7:00 AM                    Pacing      4
7  Barren        Dragon 10:00 AM                    Pacing      5
8  Barren        Dragon  2:00 PM                    Pacing     19
9  Barren        Dragon  6:00 PM                    Pacing      4
10    Jar           Elf  7:00 AM                   Zig zag      4
11    Jar           Elf  6:00 PM                   Zig zag      6
12  Small           Elf  7:00 AM                    Pacing     15
13  Small           Elf 10:00 AM                    Pacing     15
14  Small           Elf  2:00 PM                    Pacing    109
15  Small           Elf  6:00 PM                    Pacing     25
16 Medium           Elf  7:00 AM                    Pacing      4
17 Medium           Elf 10:00 AM                    Pacing     34
18 Medium           Elf  2:00 PM                    Pacing     14
19  Large           Elf 10:00 AM                   Circles     12
20  Large           Elf 10:00 AM                    Pacing      8
21 Barren           Elf  7:00 AM                    Pacing    186
22 Barren           Elf 10:00 AM                    Pacing      8
23 Barren           Elf  6:00 PM                    Pacing      8
24  Small         Fairy  7:00 AM                    Pacing    234
25  Small         Fairy 10:00 AM                    Pacing    340
26  Small         Fairy  2:00 PM                    Pacing    381
27  Small         Fairy  6:00 PM                    Pacing    212
28  Large         Fairy 10:00 AM                   Circles      8
29  Large         Fairy 10:00 AM                    Pacing     22
30  Large         Fairy  2:00 PM                    Pacing      7
31 Barren         Fairy  7:00 AM                    Pacing      6
32 Barren         Fairy 10:00 AM                    Pacing    396
33 Barren         Fairy  2:00 PM                    Pacing    132
34 Barren         Fairy  6:00 PM                    Pacing    111
35  Small         Ghost  7:00 AM                    Pacing      3
36  Small        Kraken  7:00 AM                    Pacing      6
37  Small        Kraken  7:00 AM       Pacing with a hover    790
38  Small        Kraken 10:00 AM       Pacing with a hover   1204
39  Small        Kraken 10:00 AM                   Zig zag     77
40  Small        Kraken  2:00 PM       Pacing with a hover    746
41  Small        Kraken  2:00 PM                   Zig zag     64
42 Medium        Kraken 10:00 AM                   Circles     13
43 Medium        Kraken  2:00 PM                   Circles     22
44  Large        Kraken  7:00 AM                   Circles      9
45  Large        Kraken 10:00 AM                    Pacing     77
46  Large        Kraken  2:00 PM                   Circles     43
47  Large        Kraken  2:00 PM                    Pacing     41
48  Large        Kraken  6:00 PM                   Circles      6
49 Barren        Kraken  7:00 AM                   Circles     15
50 Barren        Kraken  7:00 AM                   Zig zag     11
51 Barren        Kraken 10:00 AM                   Circles     41
52 Barren        Kraken 10:00 AM                    Pacing     28
53 Barren        Kraken  2:00 PM                    Pacing     21
54 Barren        Kraken  2:00 PM                   Zig zag     12
55 Barren        Kraken  6:00 PM                   Circles      7
56 Barren        Kraken  6:00 PM                    Pacing      0
57  Small Orange Pendek 10:00 AM                    Pacing      5
58 Medium Orange Pendek  7:00 AM                    Pacing      8
59 Medium Orange Pendek 10:00 AM                    Pacing     41
60 Medium Orange Pendek  2:00 PM                    Pacing      4
61 Medium Orange Pendek  6:00 PM                    Pacing      3
62  Large Orange Pendek  7:00 AM                   Circles     15
63    Jar       Phoenix 10:00 AM                    Pacing      3
64  Small       Phoenix  7:00 AM                    Pacing      7
65  Small       Phoenix  2:00 PM                    Pacing     14
66  Small       Phoenix  6:00 PM                    Pacing      8
67 Medium       Phoenix  7:00 AM                    Pacing     36
68 Medium       Phoenix 10:00 AM                    Pacing     19
69 Medium       Phoenix  2:00 PM                    Pacing    100
70 Medium       Phoenix  6:00 PM                    Pacing     58
71  Large       Phoenix  7:00 AM                    Pacing      6
72  Large       Phoenix 10:00 AM                    Pacing     19
73  Large       Phoenix  2:00 PM                    Pacing     13
74  Large       Phoenix  6:00 PM                    Pacing      8
75 Barren       Phoenix 10:00 AM                   Circles     27
76 Barren       Phoenix 10:00 AM                    Pacing     18
77 Barren       Phoenix  2:00 PM           Pace then hover      0
78 Barren       Phoenix  6:00 PM                   Circles      3
79 Barren       Phoenix  6:00 PM           Pace then hover      0
80    Jar        Wizard  2:00 PM                    Pacing      7
81  Small        Wizard  7:00 AM                    Pacing    681
82  Small        Wizard 10:00 AM                    Pacing      8
83  Small        Wizard  6:00 PM                    Pacing     52
84 Medium        Wizard  7:00 AM                    Pacing     35
85 Medium        Wizard 10:00 AM                    Pacing     16
86  Large        Wizard  7:00 AM                    Pacing     35
87  Large        Wizard 10:00 AM                    Pacing    409
88  Large        Wizard  2:00 PM                    Pacing    277
89  Large        Wizard  6:00 PM                    Pacing    185
90 Barren        Wizard  7:00 AM                    Pacing    527
91 Barren        Wizard 10:00 AM                    Pacing    472
92 Barren        Wizard  2:00 PM                    Pacing    432
93 Barren        Wizard  6:00 PM                    Pacing    641

I don’t think there are enough data to really say something about the number of repetitions. The number of repetitions seems pretty unique to a fish in a certain tank. I don’t see an obvious pattern depending on the tank. The interesting thing is that the stereotypic swimming seemed consistent in a tank for a fish (most or all trials of that day with stereotypic swimming). It is a pitty we couldn’t calculate the repeatability of the binomial glm of Stereotypic Swimming because it would have been interesting.

hist1(data$Interation.with.surface)

table(data$Interation.with.surface==0)

FALSE  TRUE 
  127   125 

Idem analysis binary generalised linear models.

(plotInteracting1 <- plot_var_binary(data, "Interacting.bin"))

(plotInteracting2 <- plot_var_binary_fish(data, "Interacting.bin"))

mInteracting <- glmmTMB(Interacting.bin ~ Tank + Time + Filter + Order + (1|Fish), data = data, family =binomial)
dropping columns from rank-deficient conditional model: Order5
glmmTMB:::Anova.glmmTMB(mInteracting)
Analysis of Deviance Table (Type II Wald chisquare tests)

Response: Interacting.bin
         Chisq Df Pr(>Chisq)   
Tank   17.1634  4   0.001797 **
Time    5.0708  3   0.166684   
Filter  0.0818  1   0.774878   
Order   1.9595  3   0.580856   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
contr_tank.size_bin(mInteracting)
         contrast odds.ratio   SE  df null z.ratio p.value asymp.LCL asymp.UCL
1    jar_vs_small       1.78 0.81 Inf    1    1.26   0.412     0.534     5.954
2   jar_vs_medium       0.76 0.41 Inf    1   -0.51   0.612     0.189     3.097
3    jar_vs_large       4.83 2.58 Inf    1    2.95   0.016     1.179    19.760
4 small_vs_medium       0.43 0.22 Inf    1   -1.63   0.310     0.109     1.690
5  small_vs_large       2.71 1.41 Inf    1    1.91   0.224     0.685    10.706
6 medium_vs_large       6.32 3.01 Inf    1    3.87  <0.001     1.796    22.204
contr_tank.enrich_bin(mInteracting)
         contrast odds.ratio   SE  df null z.ratio p.value asymp.LCL asymp.UCL
1 large_vs_barren       0.29 0.13 Inf    1   -2.70   0.007     0.117     0.711

Effect of Tank.

  • Tank. Fish were more likely to interact with the surface during a trial in Jar/Medium compared to Large (not significant for Small). Fish in Jar were x times [CI xx, xx] more likely to interact with the surface during a trial than fish in Large, and Fish in Medium were x times [CI xx, xx] then fish in Large. In addition, fish were xx times [CI xx, xx] more likely to interact with a surface during a trial in Barren than in Large.
mInteracting0 <- glmmTMB(Interacting.bin ~ Tank + Time + Filter + Order, data = data, family =binomial)
dropping columns from rank-deficient conditional model: Order5
anova(mInteracting0, mInteracting)
Data: data
Models:
mInteracting0: Interacting.bin ~ Tank + Time + Filter + Order, zi=~0, disp=~1
mInteracting: Interacting.bin ~ Tank + Time + Filter + Order + (1 | Fish), zi=~0, disp=~1
              Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)    
mInteracting0 12 350.58 392.93 -163.29   326.58                             
mInteracting  13 333.41 379.30 -153.71   307.41 19.163      1    1.2e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print(VarCorr(mInteracting),comp=c("Variance"), digits=2)

Conditional model:
 Groups Name        Variance
 Fish   (Intercept) 0.96    

Likelihood ratio test indicates a statistically significant random effect. Meaning that fish behaved differently among each others in terms of interaction with surface.

mInteracting_res <- simulateResiduals(mInteracting)
plot(mInteracting_res)

plotResiduals(mInteracting_res, form = data$Tank)

The model seems good.

Just a plot with the type of interaction depending on the tank. To stick with the analysis, it is a count (count = 1 if the fish performs a certain type of interaction with surface during a trial).

The second plot groups similar interactions with surface together.

hist1(data$Nest.building)

table(data$Nest.building==0)

FALSE  TRUE 
   51   201 

There are 1 trial out of 5 where there was nest building. Analysis with binary generalised model.

(plotNest1 <- plot_var_binary(data, "Nest.bin"))

(plotNest2<- plot_var_binary_fish(data, "Nest.bin"))

mNest <- glmmTMB(Nest.bin ~ Tank + Time + Filter + Order + (1|Fish), data = data, family =binomial)
dropping columns from rank-deficient conditional model: Order5
glmmTMB:::Anova.glmmTMB(mNest)
Analysis of Deviance Table (Type II Wald chisquare tests)

Response: Nest.bin
         Chisq Df Pr(>Chisq)    
Tank    8.6944  4  0.0692076 .  
Time    7.3587  3  0.0613014 .  
Filter 11.7652  1  0.0006035 ***
Order   4.6506  3  0.1992406    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
contr_tank.size_bin(mNest)
         contrast odds.ratio   SE  df null z.ratio p.value asymp.LCL asymp.UCL
1    jar_vs_small       0.25 0.14 Inf    1   -2.40   0.097     0.052     1.147
2   jar_vs_medium       0.20 0.14 Inf    1   -2.31   0.104     0.033     1.254
3    jar_vs_large       0.23 0.15 Inf    1   -2.23   0.104     0.039     1.310
4 small_vs_medium       0.83 0.49 Inf    1   -0.32  >0.999     0.174     3.943
5  small_vs_large       0.92 0.54 Inf    1   -0.14  >0.999     0.198     4.313
6 medium_vs_large       1.11 0.64 Inf    1    0.19  >0.999     0.244     5.101
contr_tank.enrich_bin(mNest)
         contrast odds.ratio   SE  df null z.ratio p.value asymp.LCL asymp.UCL
1 large_vs_barren       3.60 2.57 Inf    1    1.79   0.073     0.889    14.576
emmeans(mNest, ~ Filter, type = "response")
 Filter      prob     SE  df asymp.LCL asymp.UCL
 No-filter 0.2125 0.0665 Inf   0.11018    0.3702
 Filter    0.0166 0.0133 Inf   0.00338    0.0771

Results are averaged over the levels of: Tank, Time, Order 
Confidence level used: 0.95 
Intervals are back-transformed from the logit scale 

Effect of Filter only.

  • Tank. Almost significant.
  • Filter. Fish were more likely to build a nest without a filter in their tank.
mNest0 <- glmmTMB(Nest.bin ~ Tank + Time + Filter + Order, data = data, family =binomial)
dropping columns from rank-deficient conditional model: Order5
anova(mNest0, mNest)
Data: data
Models:
mNest0: Nest.bin ~ Tank + Time + Filter + Order, zi=~0, disp=~1
mNest: Nest.bin ~ Tank + Time + Filter + Order + (1 | Fish), zi=~0, disp=~1
       Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)    
mNest0 12 240.49 282.84 -108.24   216.49                             
mNest  13 229.78 275.67 -101.89   203.78 12.703      1  0.0003652 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print(VarCorr(mNest),comp=c("Variance"), digits=2)

Conditional model:
 Groups Name        Variance
 Fish   (Intercept) 1.1     

Likelihood ratio test indicates a statistically significant random effect. Meaning that fish behaved differently among each others in terms of nest building.

mNest_res <- simulateResiduals(mNest)
plot(mNest_res)

plotResiduals(mNest_res, form = data$Tank)

plotResiduals(mNest_res, form = data$Time)

plotResiduals(mNest_res, form = data$Filter)

The model seems good.

To analyse the amount of time the fish was hovering on the 204 trials with hovering time > 0, we tried different models described in “Analysis_Hovering”. The best model was a linear mixed model on the amount of time spent hovering with weighted least squares (WLS). We had to use WLS because the error variance was increasing with the fitted values-higher error variance associated with high hovering time values. It was not possible to use WLS with the lmer package so we had to use the nlme package. The nlme package does not handle confounding levels of two factors - the fact that Tank = Barren is the same as Order = 5. So we had to remove Order from the analysis. The addition of Order was not changing the analysis outcomes in the other models.

hist1(data$Hovering)

table(data$Hovering==0)

FALSE  TRUE 
  204    48 
descdist(data$Hovering, boot = 1000, print = FALSE)

quantile(data$Hovering)
    0%    25%    50%    75%   100% 
  0.00   6.01  23.71  56.21 344.27 
hist1(data$Hovering[data$Hovering!=0]) # histogram without the zeros

Even if there are less zeros than Foraging or Stereotypic Swimming, it is not a normal distribution. There were only 48 trials (out of 252) during which no hovering happened. So 204 trials where we could study the amount of time hovering in 204 trials.

We fitted a glm distribution like Foraging and Stereotypic Swimming on the probability to hover (dependent variable is 1 if hovering occurred during the trial, zero otherwise).

# Effect of tank of the probability to hover during a trial
plot_var_binary(data, "Hovering.bin")

plot_var_binary_fish(data, "Hovering.bin")

# Among the trials where the fish hovers, effect of tank on the amount of time hovering
data %>% filter(Hovering >0) %>% 
  plot_var(., "Hovering")

data %>% filter(Hovering >0) %>% 
  plot_var_fish(., "Hovering")

Both the probability to hover and the time spent hovering seems to be higher in Jar compared to the others. All fish besides Goblin hovered more in Jar compared to Small.

Y variable = probability to hover during a trial

mHovering <- glmmTMB(Hovering.bin ~ Tank + Time + Filter + Order + (1|Fish), data = data, family =binomial)
dropping columns from rank-deficient conditional model: Order5
glmmTMB:::Anova.glmmTMB(mHovering)
Analysis of Deviance Table (Type II Wald chisquare tests)

Response: Hovering.bin
        Chisq Df Pr(>Chisq)  
Tank   6.6046  4    0.15832  
Time   9.4413  3    0.02396 *
Filter 4.9879  1    0.02553 *
Order  1.9380  3    0.58538  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
emmeans(mHovering, ~ Time, type = "response")
 Time      prob     SE  df asymp.LCL asymp.UCL
 7:00 AM  0.918 0.0400 Inf     0.798     0.969
 10:00 AM 0.870 0.0531 Inf     0.727     0.944
 2:00 PM  0.727 0.0803 Inf     0.546     0.855
 6:00 PM  0.746 0.0776 Inf     0.568     0.867

Results are averaged over the levels of: Tank, Filter, Order 
Confidence level used: 0.95 
Intervals are back-transformed from the logit scale 
emmeans(mHovering, ~ Filter, type = "response") 
 Filter     prob     SE  df asymp.LCL asymp.UCL
 No-filter 0.907 0.0338 Inf     0.816     0.955
 Filter    0.712 0.1001 Inf     0.487     0.866

Results are averaged over the levels of: Tank, Time, Order 
Confidence level used: 0.95 
Intervals are back-transformed from the logit scale 
## Random effect
mHovering0 <- glmmTMB(Hovering.bin ~ Tank + Time + Filter + Order, data = data, family =binomial)
dropping columns from rank-deficient conditional model: Order5
anova(mHovering0, mHovering)
Data: data
Models:
mHovering0: Hovering.bin ~ Tank + Time + Filter + Order, zi=~0, disp=~1
mHovering: Hovering.bin ~ Tank + Time + Filter + Order + (1 | Fish), zi=~0, disp=~1
           Df   AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)  
mHovering0 12 245.5 287.86 -110.75    221.5                           
mHovering  13 241.7 287.59 -107.85    215.7 5.7995      1    0.01603 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print(VarCorr(mHovering),comp=c("Variance"), digits=2)

Conditional model:
 Groups Name        Variance
 Fish   (Intercept) 0.63    

No effect of Tank on the probability to hover during a trial, only an effect of Time and Filter

  • Time. Fish were more likely to hover in the morning.
  • Filter. Fish were less likely to hover with a filter in their tank.
mHovering_res <- simulateResiduals(mHovering)
plot(mHovering_res)

The model seems good.

Y variable = amount of time spent hovering during a trial. Only trials with hovering time > 0

mHovering2 <- lme(Hovering ~ Tank + Time + Filter, random = ~1 | Fish, data = data[data$Hovering>0, ], weights = varPower(), control = lmeControl(maxIter = 20000))
anova(mHovering2)
            numDF denDF  F-value p-value
(Intercept)     1   183 43.85489  <.0001
Tank            4   183 18.75147  <.0001
Time            3   183  3.66785  0.0134
Filter          1   183  5.29132  0.0226
contr_tank.size(mHovering2)
         contrast estimate   SE  df t.ratio p.value lower.CL upper.CL
1    jar_vs_small       53 12.0 183    4.40  <0.001       21       85
2   jar_vs_medium       53 12.0 183    4.42  <0.001       21       85
3    jar_vs_large       61 11.9 183    5.10  <0.001       29       93
4 small_vs_medium        1  6.5 183    0.08   0.938      -17       18
5  small_vs_large        8  6.2 183    1.30   0.394       -9       25
6 medium_vs_large        8  2.7 183    2.81   0.016        0       15
contr_tank.enrich(mHovering2)
         contrast estimate  SE  df t.ratio p.value lower.CL upper.CL
1 large_vs_barren      -22 4.9 183   -4.53  <0.001      -32      -12
emmeans(mHovering2, ~ Time)
 Time     emmean   SE df lower.CL upper.CL
 7:00 AM    39.3 5.06 12     28.2     50.3
 10:00 AM   48.3 5.72 12     35.8     60.8
 2:00 PM    48.1 5.68 12     35.7     60.4
 6:00 PM    42.4 5.23 12     31.0     53.8

Results are averaged over the levels of: Tank, Filter 
Degrees-of-freedom method: containment 
Confidence level used: 0.95 
emmeans(mHovering2, ~ Filter)
 Filter    emmean   SE df lower.CL upper.CL
 No-filter   52.2 5.30 12     40.7     63.8
 Filter      36.8 6.53 12     22.5     51.0

Results are averaged over the levels of: Tank, Time 
Degrees-of-freedom method: containment 
Confidence level used: 0.95 
# no random effect estimate possible with the weighted least squares

Effect of Tank, Time and Filter.

  • Tank. Jar hovered significantly more time than Small, Medium and Large. Fish in Jar hovered 53 sec [CI 20, 85] more than Small (over a 600 sec trial), 53 sec [CI 20, 86] more than Medium, and 61 sec [CI 28, 93] more than Large. Fish in medium hovered more time than fish in Large. In addition, fish in Barren hovered significantly more time (22 sec [CI 12, 35]) than fish in Large.
  • Time. Fish hovered more time in the middle of the day.
  • Filter. Fish hovered less time with a filter in their tank.
plot(mHovering2, resid(., type = "pearson") ~ fitted(.), aspect = 1, pch = 21, abline = 0)

plot(mHovering2, abs(resid(., type = "pearson")) ~ fitted(.), aspect = 1, pch = 21, abline = 0)

qqnorm(resid(mHovering2, type = "pearson"))
qqline(resid(mHovering2, type = "pearson"))

ranefPlot <- ranef(mHovering2, level = 1)
qqnorm(ranefPlot$`(Intercept)`)
qqline(ranefPlot$`(Intercept)`)

The model seems good besides the normality. But linear models are robust to the normality assumption (= we can still draw true inference).

Just a plot with where the fish hovers depending on the tank.

Fish used different places for hovering in all tanks.

ggplot(dataUpDown, aes(y = adj.up, x = Tank)) +
    geom_jitter(alpha = .5, width = 0.05, height=0, color = grey(0.25)) +
    stat_summary(fun.y = mean,geom = "point",colour = "red", size = 4)+
    stat_summary(fun.min = function(x) mean(x) - sd(x), 
                 fun.max = function(x) mean(x) + sd(x), 
                 geom = "errorbar",colour = "red", width = 0.15)+
  ylab("Time up (sec)")

It seems that the tank has an influence on whether fish prefer being on the upper half or lower half of the tank.

We can do different models: + Look at the percentage of time spent up of the trial. Fit a beta distribution. Problem: All percentages need to be 0 < perc < 100%. There are trials with 100% so we can’t use beta distribution.

  • Look at the count of seconds (how many seconds the fish spend up of the tank). Fit a binomial distribution. Problem: The diagnostics plots were very bad when fitting mUp <- glmmTMB(cbind(up, down) ~ Tank + Time + Filter + (1|Fish), data = dataUpDown, family = binomial)

  • Look simply at the actual time spent up. Fit a normal distribution. The response variable can be adjusted by the fact that the trial were not perfectly 600 sec. \(adj.up= \frac{\text{time spent up} * 600}{\text{trial total duration}}\). It is what has been done.

mUp <- lmer(adj.up ~ Tank + Time + Filter + Order + (1|Fish), data = dataUpDown)
fixed-effect model matrix is rank deficient so dropping 1 column / coefficient
anova(mUp, ddf = "Kenward-Roger", type = 2)
Type II Analysis of Variance Table with Kenward-Roger's method
       Sum Sq Mean Sq NumDF  DenDF F value    Pr(>F)    
Tank   251386   62846     4 234.07  3.8215 0.0049776 ** 
Time   284189   94730     3 228.06  5.7613 0.0008159 ***
Filter  87414   87414     1 130.01  5.3164 0.0227095 *  
Order   70637   23546     3 230.78  1.4320 0.2341679    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
contr_tank.size(mUp)
         contrast estimate   SE  df t.ratio p.value lower.CL upper.CL
1    jar_vs_small      -71 26.0 229   -2.72   0.036     -140       -1
2   jar_vs_medium       33 29.9 240    1.10   0.667      -47      112
3    jar_vs_large        2 29.4 240    0.06   0.953      -76       80
4 small_vs_medium      103 29.1 240    3.55   0.003       26      181
5  small_vs_large       72 28.9 240    2.50   0.052       -5      149
6 medium_vs_large      -31 25.4 228   -1.22   0.667      -99       36
contr_tank.enrich(mUp)
         contrast estimate   SE  df t.ratio p.value lower.CL upper.CL
1 large_vs_barren      -25 25.9 229   -0.97   0.335      -76       26
emmeans(mUp, ~ Time) # estimated means of time spent swimming depending on the time of the day 
 Time     emmean   SE   df lower.CL upper.CL
 7:00 AM     274 24.1 30.6      225      323
 10:00 AM    279 24.1 30.6      230      329
 2:00 PM     224 24.1 30.6      175      273
 6:00 PM     200 24.1 30.6      151      249

Results are averaged over the levels of: Tank, Filter, Order 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
emmeans(mUp, ~ Filter) # estimated means of time spent swimming depending on the time of the day
 Filter    emmean   SE   df lower.CL upper.CL
 No-filter    280 20.6 15.4      237      324
 Filter       208 28.9 35.1      150      267

Results are averaged over the levels of: Tank, Time, Order 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 

Effect of Tank, Time and Filter.

  • Tank. Statistically significant difference between Small and Medium/Jar: Fish in Small tank spent on average 103 more seconds [95% CI 26, 181] on the upper half of the tank compared to fish in Medium tanks. At the opposite, Fish in Small tank spent on average 71 more seconds [95% CI 26, 181] on the lower half of the tank compared to fish in Jar tanks.
  • Time. Fish spent more time on the upper half of the tank in the morning.
  • Filter. Fish spent more time on the upper half of the tank with a filter in their tank.
lm_diagnosis(mUp)
[1] "check constant variance over fitted values"

[1] "Normality of adj.up"

[1] "Normality random residuals"

The model seems good.

sessionInfo()
R version 4.3.3 (2024-02-29 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default


locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: Australia/Sydney
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
 [1] renv_1.0.7          nlme_3.1-164        performance_0.11.0 
 [4] rptR_0.9.22         DHARMa_0.4.6        glmmTMB_1.1.9      
 [7] fitdistrplus_1.1-11 survival_3.5-8      MASS_7.3-60.0.1    
[10] emmeans_1.10.1      lmerTest_3.1-3      lme4_1.1-35.2      
[13] Matrix_1.6-5        FactoMineR_2.10     RColorBrewer_1.1-3 
[16] ggforce_0.4.2       ggrepel_0.9.5       ggpubr_0.6.0       
[19] lubridate_1.9.3     forcats_1.0.0       stringr_1.5.1      
[22] dplyr_1.1.3         purrr_1.0.2         readr_2.1.5        
[25] tidyr_1.3.1         tibble_3.2.1        ggplot2_3.5.0      
[28] tidyverse_2.0.0    

loaded via a namespace (and not attached):
 [1] Rdpack_2.6           rlang_1.1.1          magrittr_2.0.3      
 [4] compiler_4.3.3       mgcv_1.9-1           vctrs_0.6.3         
 [7] pkgconfig_2.0.3      fastmap_1.1.1        backports_1.4.1     
[10] labeling_0.4.3       utf8_1.2.3           promises_1.3.0      
[13] rmarkdown_2.26       tzdb_0.4.0           nloptr_2.0.3        
[16] xfun_0.43            jsonlite_1.8.8       flashClust_1.01-2   
[19] later_1.3.2          tweenr_2.0.3         broom_1.0.5         
[22] parallel_4.3.3       cluster_2.1.6        R6_2.5.1            
[25] gap.datasets_0.0.6   qgam_1.3.4           stringi_1.8.3       
[28] car_3.1-2            boot_1.3-29          numDeriv_2016.8-1.1 
[31] estimability_1.5     iterators_1.0.14     Rcpp_1.0.11         
[34] knitr_1.46           httpuv_1.6.15        splines_4.3.3       
[37] timechange_0.3.0     tidyselect_1.2.1     rstudioapi_0.16.0   
[40] abind_1.4-5          yaml_2.3.8           codetools_0.2-19    
[43] doParallel_1.0.17    TMB_1.9.11           plyr_1.8.9          
[46] lattice_0.22-5       shiny_1.8.1.1        withr_3.0.0         
[49] evaluate_0.23        polyclip_1.10-6      pillar_1.9.0        
[52] gap_1.5-3            carData_3.0-5        foreach_1.5.2       
[55] DT_0.33              insight_0.19.10      generics_0.1.3      
[58] hms_1.1.3            munsell_0.5.1        scales_1.3.0        
[61] minqa_1.2.6          xtable_1.8-4         leaps_3.1           
[64] glue_1.6.2           scatterplot3d_0.3-44 tools_4.3.3         
[67] see_0.8.3            ggsignif_0.6.4       mvtnorm_1.2-3       
[70] grid_4.3.3           rbibutils_2.2.16     colorspace_2.1-0    
[73] patchwork_1.2.0      cli_3.6.1            fansi_1.0.4         
[76] gtable_0.3.4         rstatix_0.7.2        digest_0.6.35       
[79] pbkrtest_0.5.2       htmlwidgets_1.6.4    farver_2.1.1        
[82] htmltools_0.5.8.1    lifecycle_1.0.4      multcompView_0.1-10 
[85] mime_0.12